Module name: | FRX |
Module identifier: | IG_REC_RM_FRX |
Filling methods supported: | IG_REC_FM_OMNIFONT |
Filters supported: | all filter elements |
Trade-off supported: | none |
Knowledge base files: | none |
Training supported: | yes |
The OMNIFONT_PLUS2W and OMNIFONT_PLUS3W recognition modules require the presence of this module.
Its associated files are:
baltic.shp | Frx shape pack (code page) file. |
cyrillic.shp | Frx shape pack (code page) file. |
greek.shp | Frx shape pack (code page) file. |
latin1.shp | Frx shape pack (code page) file. |
latin2.shp | Frx shape pack (code page) file. |
turkish.shp | Frx shape pack (code page) file. |
charsettable.chr | |
asciieng.lng | Frx language dictionary. Used in case of multi-language selection. |
czech.lng | Frx language dictionary data file. |
danish.lng | Frx language dictionary data file. |
dutch.lng | Frx language dictionary data file. |
english.lng | Frx language dictionary data file. |
finnish.lng | Frx language dictionary data file. |
french.lng | Frx language dictionary data file. |
german.lng | Frx language dictionary data file. |
greek.lng | Frx language dictionary data file. |
hungar.lng | Frx language dictionary data file. |
italian.lng | Frx language dictionary data file. |
norsk.lng | Frx language dictionary data file. |
polish.lng | Frx language dictionary data file. |
port.lng | Frx language dictionary data file. |
russian.lng | Frx language dictionary data file. |
spanish.lng | Frx language dictionary data file. |
swedish.lng | Frx language dictionary data file. |
turkish.lng | Frx language dictionary data file. |
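When deploying the module, it can help to confirm that the associated files are actually present. The following is a minimal sketch in Python, not part of the product API: the file names are copied from the list above, while the helper name `missing_frx_files` and the idea of a single deployment directory are assumptions made for illustration. Whether every file is needed for a given application is not stated here, so treat the result as a checklist rather than a hard requirement.

```python
from pathlib import Path

# File names copied from the list of associated files above.
FRX_FILES = [
    "baltic.shp", "cyrillic.shp", "greek.shp",
    "latin1.shp", "latin2.shp", "turkish.shp",
    "charsettable.chr",
    "asciieng.lng", "czech.lng", "danish.lng", "dutch.lng",
    "english.lng", "finnish.lng", "french.lng", "german.lng",
    "greek.lng", "hungar.lng", "italian.lng", "norsk.lng",
    "polish.lng", "port.lng", "russian.lng", "spanish.lng",
    "swedish.lng", "turkish.lng",
]


def missing_frx_files(directory):
    """Return the FRX file names that are not found in `directory`.

    Hypothetical helper: assumes all associated files live in one folder.
    Names are compared in lower case so the check behaves the same on
    case-sensitive and case-insensitive file systems.
    """
    present = {p.name.lower() for p in Path(directory).iterdir() if p.is_file()}
    return [name for name in FRX_FILES if name not in present]
```

An empty list means every file named above was found; otherwise the returned names show what to copy into the directory.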
Application Areas
This module recognizes machine-printed text, that is, text from printed publications, laser or ink-jet printers, and electric typewriters. Output from mechanical typewriters in good condition may also be acceptable. The module should also be used for letter-quality or near-letter-quality (LQ, NLQ) output from dot-matrix printers.
Range of Characters
This module supports the recognition of the Latin, Greek, and Cyrillic alphabets, with enough accented letters to cover the 54 supported languages.
The characters are listed in category and alphanumeric order, together with their Code Page values, in Characters and Code Pages.
Multi-Lingual Language Support
The language support of this module is based on the module's internal code pages, each of which contains the characters of a related group of languages. The internal code pages of this module are American/European (Latin 1, 1252), Baltic (1257), Central-European (Latin 2, 1250), Cyrillic (1251), Greek (1253), and Turkish (1254).
The module supports multi-language selection for recognition, but only for language combinations within the same code page; languages from different code pages may not be recognized properly. For example, it properly processes the combination of English, German, and Italian, since all of these languages belong to the Latin 1 (1252) code page. However, when both French and Czech are specified, OMNIFONT_FRX may fail to recognize some accented characters in the Czech alphabet, since these languages are not in the same code page. The following table lists the languages supported by FRX, grouped by code page.
Latin 2 (1250) | Polish, Czech, Hungarian, Romanian, Albanian, Croatian, Wend (Sorbian), Slovak, Slovenian |
Cyrillic (1251) | Russian, Ukrainian, Byelorussian, Bulgarian, Macedonian, Serbian |
Latin 1 (1252) | English, German, French, Spanish, Italian, Dutch, Swedish, Norwegian, Finnish, Danish, Portuguese, Portuguese (Brazilian), Catalan, Afrikaans, Aymara, Basque, Breton, Faroese, Friulian, Gaelic, Galician, Eskimo, Icelandic, Indonesian, Latin, Malaysian, Pidgin English, Swahili, Tahitian, Welsh, Frisian, Zulu |
Greek (1253) | Greek |
Turkish (1254) | Turkish, Kurdish (written in Latin alphabet) |
Baltic (1257) | Estonian, Hawaiian, Latvian, Lithuanian |
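The same-code-page rule can be checked before a language combination is passed to the module. The sketch below is illustrative Python, not part of the product API: the table data is copied from the table above, while the names `CODE_PAGES`, `code_pages_for`, and `is_single_code_page` are hypothetical and exist only for this example.

```python
# Languages by code page, copied from the table above.
CODE_PAGES = {
    "Latin 2 (1250)": [
        "Polish", "Czech", "Hungarian", "Romanian", "Albanian", "Croatian",
        "Wend (Sorbian)", "Slovak", "Slovenian",
    ],
    "Cyrillic (1251)": [
        "Russian", "Ukrainian", "Byelorussian", "Bulgarian", "Macedonian",
        "Serbian",
    ],
    "Latin 1 (1252)": [
        "English", "German", "French", "Spanish", "Italian", "Dutch",
        "Swedish", "Norwegian", "Finnish", "Danish", "Portuguese",
        "Portuguese (Brazilian)", "Catalan", "Afrikaans", "Aymara", "Basque",
        "Breton", "Faroese", "Friulian", "Gaelic", "Galician", "Eskimo",
        "Icelandic", "Indonesian", "Latin", "Malaysian", "Pidgin English",
        "Swahili", "Tahitian", "Welsh", "Frisian", "Zulu",
    ],
    "Greek (1253)": ["Greek"],
    "Turkish (1254)": ["Turkish", "Kurdish (written in Latin alphabet)"],
    "Baltic (1257)": ["Estonian", "Hawaiian", "Latvian", "Lithuanian"],
}

# Reverse lookup: language name -> code page name.
LANGUAGE_TO_CODE_PAGE = {
    language: code_page
    for code_page, languages in CODE_PAGES.items()
    for language in languages
}


def code_pages_for(languages):
    """Group the requested languages by the code page they belong to.

    Raises KeyError for a language that is not in the table.
    """
    groups = {}
    for language in languages:
        code_page = LANGUAGE_TO_CODE_PAGE[language]
        groups.setdefault(code_page, []).append(language)
    return groups


def is_single_code_page(languages):
    """Return True when all requested languages share one code page."""
    return len(code_pages_for(languages)) <= 1


print(is_single_code_page(["English", "German", "Italian"]))  # True
print(is_single_code_page(["French", "Czech"]))               # False
```

For a rejected combination, `code_pages_for` shows which languages fall into which code page, so the selection can be split into separate recognition passes if the application allows it.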
Character Attributes
The omnifont recognition module can detect and transmit character attributes: bold, italic, or underlined text (or any combination of them). It can also detect and transmit character size, and can classify font types into three broad categories: serif, sans serif, and monospaced.